A warning sign is displayed if the sum of the deviations of the observed distribution
from the normal distribution represents more than 15% of the reads. A failure sign is
displayed if the sum of the deviations represents more than 30% of the reads.
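The thresholds above can be illustrated with a minimal sketch. The text gives only the 15% and 30% cut-offs; the way the reference normal curve is fitted here (using the mean and standard deviation of the observed histogram) is an assumption made for illustration, and the names `deviation_percent` and `flag` are hypothetical rather than part of any tool.

```python
import math


def deviation_percent(observed):
    """Sum of |observed - expected| over all bins, as a percentage of reads.

    observed[i] is the number of reads falling in bin i.
    Assumption: the reference normal curve is fitted with the mean and
    standard deviation of the observed histogram itself.
    """
    total = sum(observed)
    if total == 0:
        return 0.0
    mean = sum(i * c for i, c in enumerate(observed)) / total
    var = sum(c * (i - mean) ** 2 for i, c in enumerate(observed)) / total
    if var == 0:
        # Degenerate case: all reads in one bin, so no curve can be fitted.
        return 0.0
    sd = math.sqrt(var)
    # Expected read count per bin under the fitted normal curve.
    expected = [
        total * math.exp(-((i - mean) ** 2) / (2 * var)) / (sd * math.sqrt(2 * math.pi))
        for i in range(len(observed))
    ]
    deviation = sum(abs(o - e) for o, e in zip(observed, expected))
    return 100.0 * deviation / total


def flag(percent):
    """Map a deviation percentage to the warning/failure thresholds in the text."""
    if percent > 30:
        return "fail"
    if percent > 15:
        return "warn"
    return "pass"
```

A histogram with two separated peaks, for example, sits far from any single fitted normal curve and would be flagged as a failure under these assumptions.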
1.5.7 Per Base N Content
During the sequencing process, a base is normally called with high confidence. However,
if the machine fails to call a base at a specific position with sufficient confidence, the
“N” character is placed at that position to indicate the call failure. A few call failures
are tolerable; however, a high frequency of “N” indicates a quality problem. The per base N
content graph shows the percentage of “N” calls at each base position, with the N
percentages plotted on the y-axis against the positions on the x-axis. A warning is issued
if any position shows an N content greater than 5%, and a failure is issued if any position
shows an N content greater than 20%. Figure 1.21 shows a per base N content graph with no problem.
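The per base N calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code of any QC tool: the function names `per_base_n_percent` and `n_content_flag` are hypothetical, and reads are assumed to be plain strings already extracted from a FASTQ file.

```python
def per_base_n_percent(reads):
    """Return the percentage of 'N' calls at each base position."""
    max_len = max((len(r) for r in reads), default=0)
    n_counts = [0] * max_len   # number of N calls seen at each position
    totals = [0] * max_len     # number of reads covering each position
    for read in reads:
        for pos, base in enumerate(read):
            totals[pos] += 1
            if base in "Nn":
                n_counts[pos] += 1
    return [100.0 * n / t for n, t in zip(n_counts, totals)]


def n_content_flag(percentages):
    """Apply the thresholds from the text: warn above 5%, fail above 20%."""
    worst = max(percentages, default=0.0)
    if worst > 20:
        return "fail"
    if worst > 5:
        return "warn"
    return "pass"


# Toy input: four short reads, with N calls at positions 2 and 5.
reads = ["ACGTN", "ACGTA", "ANGTC", "ACGTN"]
percentages = per_base_n_percent(reads)
print(percentages)                  # [0.0, 25.0, 0.0, 0.0, 50.0]
print(n_content_flag(percentages))  # fail
```

Each percentage is computed over the reads that actually cover that position, so reads of different lengths do not dilute the N content of later positions.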
FIGURE 1.21 Per base N content.
1.5.8 Sequence Length Distribution
In the library preparation step, DNA molecules are cut into equal fragments to generate
reads with equal lengths. Most sequencing instruments run quality control to keep the